19 research outputs found
Depth CNNs for RGB-D scene recognition: learning from scratch better than transferring from RGB-CNNs
Scene recognition with RGB images has been extensively studied and has
reached very remarkable recognition levels, thanks to convolutional neural
networks (CNN) and large scene datasets. In contrast, current RGB-D scene data
is much more limited, so often leverages RGB large datasets, by transferring
pretrained RGB CNN models and fine-tuning with the target RGB-D dataset.
However, we show that this approach has the limitation of hardly reaching
bottom layers, which is key to learn modality-specific features. In contrast,
we focus on the bottom layers, and propose an alternative strategy to learn
depth features combining local weakly supervised training from patches followed
by global fine tuning with images. This strategy is capable of learning very
discriminative depth-specific features with limited depth images, without
resorting to Places-CNN. In addition we propose a modified CNN architecture to
further match the complexity of the model and the amount of data available. For
RGB-D scene recognition, depth and RGB features are combined by projecting them
in a common space and further leaning a multilayer classifier, which is jointly
optimized in an end-to-end network. Our framework achieves state-of-the-art
accuracy on NYU2 and SUN RGB-D in both depth only and combined RGB-D data.Comment: AAAI Conference on Artificial Intelligence 201
Joint Learning of CNN and LSTM for Image Captioning
Abstract. In this paper, we describe the details of our methods for the participation in the subtask of the ImageCLEF 2016 Scalable Image Annotation task: Natural Language Caption Generation. The model we used is the combination of a procedure of encoding and a procedure of decoding, which includes a Convolutional neural network(CNN) and a Long Short-Term Memory(LSTM) based Recurrent Neural Network. We first train a model on the MSCOCO dataset and then fine tune the model on different target datasets collected by us to get a more suitable model for the natural language caption generation task. Both of the parameters of CNN and LSTM are learned together
Towards Domain-Specific Knowledge Graph Construction for Flight Control Aided Maintenance
Flight control is a key system of modern aircraft. During each flight, pilots use flight control to control the forces of flight and also the aircraft’s direction and attitude. Whether flight control can work properly is closely related to safety such that daily maintenance is an essential task of airlines. Flight control maintenance heavily relies on expert knowledge. To facilitate knowledge achievement, aircraft manufacturers and airlines normally provide structural manuals for consulting. On the other hand, computer-aided maintenance systems are adopted for improving daily maintenance efficiency. However, we find that grass-roots engineers of airlines still inevitably consult unstructured technical manuals from time to time, for example, when meeting an unusual problem or an unfamiliar type of aircraft. Achieving effective knowledge from unstructured data is inefficient and inconvenient. Aiming at the problem, we propose a knowledge-graph-based maintenance prototype system as a complementary solution. The knowledge graph we built is dedicated for unstructured manuals referring to flight control. We first build ontology to represent key concepts and relation types and then perform entity-relation extraction adopting a pipeline paradigm with natural language processing techniques. To fully utilize domain-specific features, we present a hybrid method consisting of dedicated rules and a machine learning model for entity recognition. As for relation extraction, we leverage a two-stage Bi-LSTM (bi-directional long short-term memory networks) based method to improve the extraction precision by solving a sample imbalanced problem. We conduct comprehensive experiments to study the technical feasibility on real manuals from airlines. The average precision of entity recognition reaches 85%, and the average precision of relation extraction comes to 61%. Finally, we design a flight control maintenance prototype system based on the knowledge graph constructed and a graph database Neo4j. The prototype system takes alarm messages represented in natural language as the input and returns maintenance suggestions to serve grass-roots engineers
Towards Domain-Specific Knowledge Graph Construction for Flight Control Aided Maintenance
Flight control is a key system of modern aircraft. During each flight, pilots use flight control to control the forces of flight and also the aircraft’s direction and attitude. Whether flight control can work properly is closely related to safety such that daily maintenance is an essential task of airlines. Flight control maintenance heavily relies on expert knowledge. To facilitate knowledge achievement, aircraft manufacturers and airlines normally provide structural manuals for consulting. On the other hand, computer-aided maintenance systems are adopted for improving daily maintenance efficiency. However, we find that grass-roots engineers of airlines still inevitably consult unstructured technical manuals from time to time, for example, when meeting an unusual problem or an unfamiliar type of aircraft. Achieving effective knowledge from unstructured data is inefficient and inconvenient. Aiming at the problem, we propose a knowledge-graph-based maintenance prototype system as a complementary solution. The knowledge graph we built is dedicated for unstructured manuals referring to flight control. We first build ontology to represent key concepts and relation types and then perform entity-relation extraction adopting a pipeline paradigm with natural language processing techniques. To fully utilize domain-specific features, we present a hybrid method consisting of dedicated rules and a machine learning model for entity recognition. As for relation extraction, we leverage a two-stage Bi-LSTM (bi-directional long short-term memory networks) based method to improve the extraction precision by solving a sample imbalanced problem. We conduct comprehensive experiments to study the technical feasibility on real manuals from airlines. The average precision of entity recognition reaches 85%, and the average precision of relation extraction comes to 61%. Finally, we design a flight control maintenance prototype system based on the knowledge graph constructed and a graph database Neo4j. The prototype system takes alarm messages represented in natural language as the input and returns maintenance suggestions to serve grass-roots engineers
Multipath Convolutional-Recursive Neural Networks for Object Recognition
Part 8: Pattern RecognitionInternational audienceExtracting good representations from images is essential for many computer vision tasks. While progress in deep learning shows the importance of learning hierarchical features, it is also important to learn features through multiple paths. This paper presents Multipath Convolutional-Recursive Neural Networks(M-CRNNs), a novel scheme which aims to learn image features from multiple paths using models based on combination of convolutional and recursive neural networks (CNNs and RNNs). CNNs learn low-level features, and RNNs, whose inputs are the outputs of the CNNs, learn the efficient high-level features. The final features of an image are the combination of the features from all the paths. The result shows that the features learned from M-CRNNs are a highly discriminative image representation that increases the precision in object recognition